AITopics

Genre: Research Report (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsFeb-7-2026, 16:04:50 GMT

1757af1fe1429801bdf3abf5600f8bba-Paper-Conference.pdf

We define this previously overlooked problem asnetwork loafing. As social loafing will ultimately cause the low individual productivity and the reduced overall performance, network loafing will also hinder the performance of a given residual network and its sub-networks.

artificial intelligence, machine learning, residual network, (18 more...)

Country:

Oceania > Australia (0.04)
North America > United States (0.04)
Asia > China (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing SystemsNov-21-2025, 13:58:05 GMT

Language Modeling with Recurrent Highway Hypernetworks

Joseph Suarez

Neural Information Processing Systems http://nips.cc/

hypernetwork, machine learning, natural language, (17 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.71)

Bonet, David, Cara, Marçal Comajoan, Calafell, Alvaro, Montserrat, Daniel Mas, Ioannidis, Alexander G.

iLTM: Integrated Large Tabular Model

arXiv.org Artificial IntelligenceNov-21-2025

Tabular data underpins decisions across science, industry, and public services. Despite rapid progress, advances in deep learning have not fully carried over to the tabular domain, where gradient-boosted decision trees (GBDTs) remain a default choice in practice. We present iLTM, an integrated Large Tabular Model that unifies tree-derived embeddings, dimensionality-agnostic representations, a meta-trained hypernetwork, multilayer perceptrons (MLPs), and retrieval within a single architecture. Pretrained on more than 1,800 heterogeneous classification datasets, iLTM achieves consistently superior performance across tabular classification and regression tasks, from small datasets to large and high-dimensional tasks. After light fine-tuning, the meta-trained hypernetwork transfers to regression targets, matching or surpassing strong baselines. Extensive experiments show that iLTM outperforms well-tuned GBDTs and leading deep tabular models while requiring less task-specific tuning. By bridging the gap between tree-based and neural methods, iLTM offers a new framework for tabular foundation models for robust, adaptable, and scalable tabular learning.

artificial intelligence, deep learning, machine learning, (16 more...)

2511.15941

Country: North America > United States > California (0.46)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Goldfeder, Judah, So, Matthew, Lipson, Hod

Generating Auxiliary Tasks with Reinforcement Learning

arXiv.org Artificial IntelligenceNov-5-2025

Auxiliary Learning (AL) is a form of multi-task learning in which a model trains on auxiliary tasks to boost performance on a primary objective. While AL has improved generalization across domains such as navigation, image classification, and NLP, it often depends on human-labeled auxiliary tasks that are costly to design and require domain expertise. Meta-learning approaches mitigate this by learning to generate auxiliary tasks, but typically rely on gradient based bi-level optimization, adding substantial computational and implementation overhead. We propose RL-AUX, a reinforcement-learning (RL) framework that dynamically creates auxiliary tasks by assigning auxiliary labels to each training example, rewarding the agent whenever its selections improve the performance on the primary task. We also explore learning per-example weights for the auxiliary loss. On CIFAR-100 grouped into 20 superclasses, our RL method outperforms human-labeled auxiliary tasks and matches the performance of a prominent bi-level optimization baseline. We present similarly strong results on other classification datasets. These results suggest RL is a viable path to generating effective auxiliary tasks.

auxiliary task, machine learning, reinforcement learning, (18 more...)

2510.2294

Genre: Research Report (0.88)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

Derajić, Bojan, Bernhard, Sebastian, Hönig, Wolfgang

ORN-CBF: Learning Observation-conditioned Residual Neural Control Barrier Functions via Hypernetworks

arXiv.org Artificial IntelligenceOct-14-2025

Control barrier functions (CBFs) have been demonstrated as an effective method for safety-critical control of autonomous systems. Although CBFs are simple to deploy, their design remains challenging, motivating the development of learning-based approaches. Yet, issues such as suboptimal safe sets, applicability in partially observable environments, and lack of rigorous safety guarantees persist. In this work, we propose observation-conditioned neural CBFs based on Hamilton-Jacobi (HJ) reachability analysis, which approximately recover the maximal safe sets. We exploit certain mathematical properties of the HJ value function, ensuring that the predicted safe set never intersects with the observed failure set. Moreover, we leverage a hypernetwork-based architecture that is particularly suitable for the design of observation-conditioned safety filters. The proposed method is examined both in simulation and hardware experiments for a ground robot and a quadcopter. The results show improved success rates and generalization to out-of-domain environments compared to the baselines.

artificial intelligence, machine learning, value function, (19 more...)

2509.16614

Country: Europe > Germany (0.28)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Neural Information Processing SystemsOct-9-2025, 17:43:29 GMT

067437c6d5d0369b6d09200bef89715b-Paper-Conference.pdf

experiment, hyperlogic, hypernetwork, (16 more...)

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Law (1.00)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Hadramy, Sidaty El, Cherkaoui, Oumeymah, Cattin, Philippe C.

cIDIR: Conditioned Implicit Neural Representation for Regularized Deformable Image Registration

arXiv.org Artificial IntelligenceJul-18-2025

Regularization is essential in deformable image registration (DIR) to ensure that the estimated Deformation Vector Field (DVF) remains smooth, physically plausible, and anatomically consistent. However, fine-tuning regularization parameters in learning-based DIR frameworks is computationally expensive, often requiring multiple training iterations. To address this, we propose cIDI, a novel DIR framework based on Implicit Neural Representations (INRs) that conditions the registration process on regularization hyperparameters. Unlike conventional methods that require retraining for each regularization hyperparameter setting, cIDIR is trained over a prior distribution of these hyperparameters, then optimized over the regularization hyperparameters by using the segmentations masks as an observation. Additionally, cIDIR models a continuous and differentiable DVF, enabling seamless integration of advanced regularization techniques via automatic differentiation. Evaluated on the DIR-LAB dataset, $\operatorname{cIDIR}$ achieves high accuracy and robustness across the dataset.

artificial intelligence, machine learning, pattern recognition, (14 more...)

2507.12953

Country: Europe > Switzerland (0.14)

Genre: Research Report (0.40)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.70)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Hwang, Sukjun, Wang, Brandon, Gu, Albert

Dynamic Chunking for End-to-End Hierarchical Sequence Modeling

arXiv.org Artificial IntelligenceJul-16-2025

Major progress on language models (LMs) in recent years has largely resulted from moving away from specialized models designed for specific tasks, to general models based on powerful architectures (e.g. the Transformer) that learn everything from raw data. Despite this trend, pre-processing steps such as tokenization remain a barrier to true end-to-end foundation models. We introduce a collection of new techniques that enable a dynamic chunking mechanism which automatically learns content- and context- dependent segmentation strategies learned jointly with the rest of the model. Incorporating this into an explicit hierarchical network (H-Net) allows replacing the (implicitly hierarchical) tokenization-LM-detokenization pipeline with a single model learned fully end-to-end. When compute- and data- matched, an H-Net with one stage of hierarchy operating at the byte level outperforms a strong Transformer language model operating over BPE tokens. Iterating the hierarchy to multiple stages further increases its performance by modeling multiple levels of abstraction, demonstrating significantly better scaling with data and matching the token-based Transformer of twice its size. H-Nets pretrained on English show significantly increased character-level robustness, and qualitatively learn meaningful data-dependent chunking strategies without any heuristics or explicit supervision. Finally, the H-Net's improvement over tokenized pipelines is further increased in languages and modalities with weaker tokenization heuristics, such as Chinese and code, or DNA sequences (nearly 4x improvement in data efficiency over baselines), showing the potential of true end-to-end models that learn and scale better from unprocessed data.

large language model, machine learning, natural language, (19 more...)

2507.07955

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)

Neural Information Processing SystemsJan-24-2025, 12:42:16 GMT

Reviews: Efficient Forward Architecture Search

This paper proposes novel neural architecture search method dubbed Petridish which is based on gradient boosting of "weak learners" (i.e. Originality: The main contribution of the paper is applying basic ideas from gradient-boosting of weak learners to the task of neural architecture search. This is an original idea, which allows a more guided exploration of the space of neural architectures compared to the random steps done, e.g. in evolutionary algorithms. Most related work is adequately discussed. The connection/differences to NAS methods combining network morphisms with evolutionary algorithms should be discussed in more detail as these explore the search space based on similar steps (modifying a model by small incremental additions) but select steps randomly and not based on gradient boosting.

efficient forward architecture search, significance, weak learner, (8 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.58)